Conversation
    /*
    std::vector<array> out;
    for(int i = 0; i < n; i++){
        auto a = matmulNT(grad_out_reshape(span, span, i), weights_reshape); // Problem is here - can't call () on Variable
This line is the only thing preventing batches from working.
I can make the matmulXY functions in arrayfire-ml support batches for now.
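For reference, a minimal sketch of what a batch-aware matmulNT could look like, built only from plain `af::array` calls (the `matmulNTBatched` name and the d0 x d1 x n batch layout are assumptions, not part of this PR): each batch slice is multiplied separately and the results are joined back along the third dimension.

```cpp
#include <arrayfire.h>

// Hypothetical helper, not from this PR: multiply each 2-D slice of a
// batched lhs (d0 x d1 x n) against a single 2-D rhs (transposed, as in
// matmulNT), then join the per-slice results along dimension 2.
static af::array matmulNTBatched(const af::array &lhs, const af::array &rhs)
{
    af::array out;
    for (int i = 0; i < (int)lhs.dims(2); i++) {
        af::array slice = af::matmulNT(lhs(af::span, af::span, i), rhs);
        out = (i == 0) ? slice : af::join(2, out, slice);
    }
    return out;
}
```

Indexing the raw `af::array` inside the helper also sidesteps the missing `operator()` on Variable mentioned above.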
pavanky left a comment
Can you rename Conv2D to Convolve2 to be consistent with arrayfire?
src/autograd/Functions.cpp (Outdated)
    for(int i = 0; i < 4; i++){
        tmp2[tmp[i]] = i;
    }
    auto reverse = Variable(array(4, tmp2), false);
`reverse` is not being used anymore.
src/nn/Modules/Conv2D.cpp (Outdated)
    if (b.array().dims(1) != 1) {
        throw af::exception("nn::Linear: Bias must be a vector.");
    }
    dim4 pdims = w.array().dims();
Btw, I added a `.dims()` method to Variable, so you don't need to call `w.array().dims()` anymore.
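For example, the last line of the hunk above could shrink to (a hedged before/after; `w` is the Variable from this hunk):

```cpp
dim4 pdims  = w.array().dims(); // before: reach through to the wrapped af::array
dim4 pdims2 = w.dims();         // after: the new Variable::dims() returns the same dim4
```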
    {
        auto res = conv2d(input, m_parameters[0], m_wx, m_wy, m_sx, m_sy, m_px, m_py);
        if (m_bias) {
            res = res + tileAs(m_parameters[1], res);
I am not familiar with bias in a convolution layer. Let me know if you find a reference for this.
The AlexNet model I pulled from Caffe's Model Zoo has both weights and biases for every learned layer.
You can see the biases in this implementation: http://www.cs.toronto.edu/~guerzhoy/tf_alexnet/
@plavin I mean the way bias is used here. I don't know if it is the same as what we are doing in the Linear layer.
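For context, here is a minimal sketch of the usual convention (the `addConvBias` name and the 1 x 1 x c bias layout are assumptions, not taken from this PR): convolution bias is one scalar per output channel, broadcast across every spatial position, which makes it the per-channel analogue of the Linear layer's per-output-unit bias.

```cpp
#include <arrayfire.h>

// Hypothetical illustration: out is w x h x c (one 2-D map per output
// channel) and bias is 1 x 1 x c. Tiling the bias over the two spatial
// dimensions adds the same scalar to every pixel of channel k, which is
// what tileAs(m_parameters[1], res) accomplishes in the hunk above.
static af::array addConvBias(const af::array &out, const af::array &bias)
{
    return out + af::tile(bias, (unsigned)out.dims(0), (unsigned)out.dims(1), 1);
}
```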
This still isn't quite perfect. The only thing left to do is to figure out how to make `grad_func` work with batched output.
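One possible direction, sketched with plain `af::array` since the `grad_func` signature isn't quoted here (every name and the dimension layout below are assumptions): for a batched forward pass out_i = in_i * weights, the input gradient stays batched while the weight gradient accumulates over the batch slices.

```cpp
#include <arrayfire.h>

// Hypothetical sketch of the per-slice gradient math for a batched matmul
// out_i = in_i * weights, with in: d0 x k x n, weights: k x d1, and
// grad_out: d0 x d1 x n. This is only the inner math, not this PR's
// actual grad_func.
static void batchedMatmulGrads(const af::array &in, const af::array &weights,
                               const af::array &grad_out,
                               af::array &grad_in, af::array &grad_weights)
{
    grad_in      = af::constant(0, in.dims());
    grad_weights = af::constant(0, weights.dims());
    for (int i = 0; i < (int)grad_out.dims(2); i++) {
        af::array g = grad_out(af::span, af::span, i);
        // dL/d(in_i) = g * weights^T, one slice per batch element
        grad_in(af::span, af::span, i) = af::matmulNT(g, weights);
        // dL/d(weights) = sum_i in_i^T * g, accumulated over the batch
        grad_weights += af::matmulTN(in(af::span, af::span, i), g);
    }
}
```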